Support proper numpy integration for ~100x performance boost by VeaaC · Pull Request #259 · heremaps/flatdata

VeaaC · 2026-04-23T08:58:28Z

flatdata-py performance: vectorized access and scalar optimization

What

Adds NumPy-based vectorized field access to flatdata-py and optimizes the scalar (element-by-element) read path. Also fixes a pre-existing bug in read_value() for unaligned 64-bit fields.

Changes

Vectorized access (`data_access.py`, `resources.py`)

read_field_vectorized(): reads a bit-packed field from all vector elements at once via NumPy, returning an ndarray. Zero-copy over the mmap'd buffer.
Vector.__getattr__("field") returns a DataFrame column for the field.
Vector.to_numpy() / to_data_frame() return all fields at once.
_VectorSlice gets the same vectorized methods.
Results are cached per vector instance via _as_numpy_2d().
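The idea behind the vectorized read can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name `read_field_vectorized_sketch`, its parameters, and the assumption of a little-endian bit-packed layout with no resource header or padding are all hypothetical, and the sketch only handles fields that span at most 8 bytes.

```python
import numpy as np


def read_field_vectorized_sketch(buf, struct_size, offset, width, signed):
    """Read one bit-packed field from every element; returns a 1-D ndarray.

    Hypothetical helper (not the flatdata-py API). `offset` and `width`
    are in bits, relative to the start of each element.
    """
    # Zero-copy 2-D view over the buffer: one row per element, one column per byte.
    rows = np.frombuffer(buf, dtype=np.uint8).reshape(-1, struct_size)
    byte_offset, shift = divmod(offset, 8)
    num_bytes = (shift + width + 7) // 8
    if num_bytes > 8:
        raise NotImplementedError("field spans more than 8 bytes")

    # Assemble the little-endian bytes covering the field into uint64 (this copies).
    acc = np.zeros(len(rows), dtype=np.uint64)
    for i in range(num_bytes):
        acc |= rows[:, byte_offset + i].astype(np.uint64) << np.uint64(8 * i)

    acc >>= np.uint64(shift)
    if width < 64:
        acc &= np.uint64((1 << width) - 1)
    if not signed:
        return acc

    # Sign-extend: reinterpret as int64, then fold the field's sign bit.
    values = acc.view(np.int64)
    if width < 64:
        sign_bit = np.int64(1 << (width - 1))
        values = (values ^ sign_bit) - sign_bit
    return values


def pack(a, b, c):
    # Illustrative 4-byte element: a = 3 bits unsigned, b = 5 bits signed, c = 16 bits unsigned.
    word = (a & 0x7) | ((b & 0x1F) << 3) | ((c & 0xFFFF) << 8)
    return word.to_bytes(4, "little")


buf = b"".join(pack(a, b, c) for a, b, c in [(5, -3, 1000), (1, 15, 0), (7, -16, 65535)])
a_col = read_field_vectorized_sketch(buf, 4, offset=0, width=3, signed=False)
b_col = read_field_vectorized_sketch(buf, 4, offset=3, width=5, signed=True)
c_col = read_field_vectorized_sketch(buf, 4, offset=8, width=16, signed=False)
```

Only the 2-D byte view is zero-copy in this sketch; the returned column is a freshly allocated array. The work per field is a handful of whole-array operations rather than one Python-level read per element.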

Pre-computed field readers (`data_access.py`, `structure.py`)

make_field_reader(offset, width, signed) builds a specialized closure with all constants (byte offset, bit shift, mask, sign handling) pre-computed. Six variants cover the cross-product of field types.
Structure.__init_subclass__ builds a _READERS dict once per class.
__getattr__, as_dict, as_list, as_tuple, as_nparray all use _READERS.
read_value() is preserved as a thin wrapper around make_field_reader for one-off reads.
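A pre-computed reader of this kind might look roughly like the sketch below. It borrows the `(offset, width, signed)` signature from the description above, but the body, the `_sketch` name, and the `(data, pos)` calling convention of the returned closure are assumptions made for illustration; it also has only two variants rather than the six mentioned.

```python
def make_field_reader_sketch(offset, width, signed):
    """Return a closure that reads one bit-packed field (hypothetical sketch)."""
    # Everything that depends only on the field definition is computed once here.
    byte_offset = offset // 8
    shift = offset % 8
    num_bytes = (shift + width + 7) // 8
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    wrap = 1 << width

    def read_unsigned(data, pos):
        start = pos + byte_offset
        raw = int.from_bytes(data[start:start + num_bytes], "little")
        return (raw >> shift) & mask

    def read_signed(data, pos):
        value = read_unsigned(data, pos)
        # Two's-complement sign extension for a `width`-bit value.
        return value - wrap if value & sign_bit else value

    return read_signed if signed else read_unsigned


# Illustrative 4-byte element: a = 3 bits unsigned, b = 5 bits signed, c = 16 bits unsigned.
word = (5 & 0x7) | ((-3 & 0x1F) << 3) | (1000 << 8)
element = word.to_bytes(4, "little")

# Built once per structure class, then reused for every element.
readers = {
    "a": make_field_reader_sketch(0, 3, False),
    "b": make_field_reader_sketch(3, 5, True),
    "c": make_field_reader_sketch(8, 16, False),
}
decoded = {name: reader(element, 0) for name, reader in readers.items()}
```

The per-read work shrinks to a slice, one `int.from_bytes` call, a shift and a mask, since offsets and masks are no longer recomputed on each access.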

Bug fix (`data_access.py`)

read_value() for 64-bit fields at non-byte-aligned offsets could return values wider than 64 bits (Python arbitrary-precision ints). The bit mask was only applied when num_bits < 64, missing the case where offset_extra_bits > 0. Fixed by masking when num_bits < 64 or offset_extra_bits > 0.
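The failure mode can be reproduced with a small stand-alone reader. The function below is a hypothetical reconstruction written for this illustration (the `fixed` flag and the function name are not part of flatdata-py); it only mirrors the masking condition described above.

```python
def read_bits_sketch(data, offset_bits, num_bits, fixed):
    """Read an unsigned bit field; `fixed` toggles the corrected mask condition."""
    offset_bytes, offset_extra_bits = divmod(offset_bits, 8)
    total_bytes = (offset_extra_bits + num_bits + 7) // 8
    value = int.from_bytes(data[offset_bytes:offset_bytes + total_bytes], "little")
    value >>= offset_extra_bits

    needs_mask = num_bits < 64  # old condition
    if fixed:
        needs_mask = needs_mask or offset_extra_bits > 0  # corrected condition
    if needs_mask:
        value &= (1 << num_bits) - 1
    return value


# A 64-bit field starting at bit 3 spans 9 bytes; set every bit to expose the leak.
data = bytes([0xFF] * 9)
buggy_value = read_bits_sketch(data, 3, 64, fixed=False)
fixed_value = read_bits_sketch(data, 3, 64, fixed=True)
```

With the old condition, the 9 bytes read for the field leave 69 significant bits after the shift, so bits belonging to whatever follows the field end up in the result. With the corrected condition the value is clipped to 64 bits.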

Other

__slots__ = () added to generated Structure subclasses (generator template + 10 golden files). Reduces instance size from 72 to 48 bytes.
Vector.__iter__ uses local variable caching to avoid repeated attribute lookups.
Removed unnecessary list() on dict keys in Archive.__getattr__.
Performance tips section added to flatdata-py/README.md.
Version bump: flatdata-generator and flatdata-py both 0.4.10 → 0.4.11.
CI workflow updated to install local generator before flatdata-py (py.yml).
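The effect of the `__slots__ = ()` item can be illustrated with a toy hierarchy. The class names below are hypothetical, and the sketch assumes a base class that itself declares `__slots__`; whether the real `Structure` base does so is not stated in this description.

```python
class BaseSketch:
    # Assumed base class that already declares its storage via __slots__.
    __slots__ = ("_data", "_pos")

    def __init__(self, data, pos):
        self._data = data
        self._pos = pos


class WithoutSlots(BaseSketch):
    # No __slots__ here, so every instance gets its own __dict__.
    pass


class WithEmptySlots(BaseSketch):
    # Empty __slots__: no per-instance __dict__ is created.
    __slots__ = ()


plain = WithoutSlots(b"", 0)
slotted = WithEmptySlots(b"", 0)
has_dict_plain = hasattr(plain, "__dict__")
has_dict_slotted = hasattr(slotted, "__dict__")
```

The exact byte savings depend on the Python version and build, so the sketch only checks for the presence of `__dict__` rather than reproducing the sizes quoted above.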

Performance

Measured on a vector from a test archive (5.8M elements, 20 fields, 32 bytes each):

| Access pattern | Before | After |
| --- | --- | --- |
| Scalar iteration (1 field) | 9.7 s | 5.8 s |
| Vectorized column access (1 field) | n/a | 0.07 s |

Signed-off-by: Christian Vetter <christian.vetter@here.com>

VeaaC force-pushed the faster-py branch from 64e06ad to fec7fb6 on April 23, 2026 08:58

VeaaC added 5 commits April 23, 2026 13:43

Support proper numpy integration for ~100x performance boost

b94a2b3

Signed-off-by: Christian Vetter <christian.vetter@here.com>

Code review changes

694688b

Signed-off-by: Christian Vetter <christian.vetter@here.com>

Tests

4ba94a0

Signed-off-by: Christian Vetter <christian.vetter@here.com>

Review fixes

a5fcec2

Signed-off-by: Christian Vetter <christian.vetter@here.com>

__slots__ optimization

87b8f69

Signed-off-by: Christian Vetter <christian.vetter@here.com>

VeaaC force-pushed the faster-py branch from 63076a5 to ed12497 on April 23, 2026 11:43

Fix CI

f111afb

Signed-off-by: Christian Vetter <christian.vetter@here.com>

VeaaC force-pushed the faster-py branch from ed12497 to f111afb on April 23, 2026 11:45

VeaaC merged commit 70b9050 into heremaps:master Apr 23, 2026
9 checks passed

VeaaC deleted the faster-py branch April 23, 2026 13:09

VeaaC mentioned this pull request Apr 27, 2026

Python: high performance backend #8

Closed

VeaaC merged 6 commits into heremaps:master from VeaaC:faster-py
